More general structure for loop parsing #91

dop-amin · 2024-10-11T14:04:13Z

This PR adds a more generic structure for handling loops: each subclass of Loop implements a specific type of loop, for example one that ends with a certain sequence of instructions. Extraction works the same way for all loop types and is therefore implemented in Loop, while the methods that produce the code differ per subclass.
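As a rough illustration of this split (not the code in this PR), the sketch below keeps extraction in a base class and leaves code emission to a subclass. The names `end_regex`, `extract`, `end`, and `SubsLoop` are assumptions made for this example only.

```python
import re
from abc import ABC, abstractmethod


class Loop(ABC):
    """Hypothetical base class: shared extraction, subclass-specific emission."""

    # Regexes that the trailing instructions of the loop must match, in order.
    end_regex = ()

    def __init__(self, lbl):
        self.lbl = lbl

    def extract(self, lines):
        """Return (pre, body, post) for the loop starting at label `self.lbl`."""
        start = lines.index(f"{self.lbl}:")
        n = len(self.end_regex)
        for i in range(start + 1, len(lines) - n + 1):
            window = lines[i:i + n]
            if all(re.fullmatch(regex, line.strip())
                   for regex, line in zip(self.end_regex, window)):
                return lines[:start], lines[start + 1:i], lines[i + n:]
        raise ValueError(f"Couldn't identify loop {self.lbl}")

    @abstractmethod
    def end(self):
        """Yield the instructions that close the loop."""


class SubsLoop(Loop):
    """Loop closed by `subs <cnt>, <cnt>, #<imm>` followed by `bne <lbl>`."""

    end_regex = (r"subs\s+\w+,\s*\w+,\s*#?\d+", r"bne\s+\w+")

    def end(self):
        # Fixed counter register and decrement, purely for illustration.
        yield "subs x0, x0, #1"
        yield f"bne {self.lbl}"
```

A new loop form would then only need its own `end_regex` and `end()`; the extraction logic stays untouched.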

I am open to suggestions on how to improve this approach; for my use cases with armv7m it has worked fine as is.

hanno-becker · 2024-10-15T03:17:54Z

slothy/targets/aarch64/aarch64_neon.py

-                    reg1 = p.group("reg1")
-                    imm = p.group("imm")
-                    state = 2
+                    if additional_data is None:


Can you say what the intended logic is here? Are you expecting this to be set by the first end-regexp?

Yes, the first end-regexp sets additional_data. I made this choice because that is usually the subs or cmp, in which we can see the counter register and/or the decrement.
I now realize that I probably want to collect additional_data for each instruction in the "end" part and then merge the dictionaries at the end. I'll change that.
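The merging idea described here could look roughly like the following sketch. It is not taken from the PR: `END_REGEX`, `collect_additional_data`, and the group names `cnt`, `imm`, and `target` are hypothetical.

```python
import re

# Hypothetical end-of-loop patterns; each may contribute named groups.
END_REGEX = (
    r"sub\s+(?P<cnt>\w+),\s*\w+,\s*#(?P<imm>\d+)",
    r"cbnz\s+(?P<cnt>\w+),\s*(?P<target>\w+)",
)


def collect_additional_data(end_lines):
    """Match every end instruction and merge the captured groups into one dict."""
    data = {}
    for regex, line in zip(END_REGEX, end_lines):
        match = re.fullmatch(regex, line.strip())
        if match is None:
            raise ValueError(f"Unexpected loop end instruction: {line!r}")
        # On duplicate keys, the value from the later instruction wins.
        data.update(match.groupdict())
    return data
```

With this approach no single instruction has to carry all the information; the counter and the immediate come from the `sub`, the branch target from the `cbnz`.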

hanno-becker · 2024-10-15T03:21:59Z

slothy/targets/arm_v81m/arch_v81m.py

+        for loop_type in Loop.__subclasses__():
+            try:
+                l = loop_type(lbl)
+                return l._extract(source, lbl) + (l,)


The ... + (l,) may merit a comment. What's happening here? While you are at it, could you document the return tuple of _extract() in a brief docstring?

Will do. (l,) wraps l in a one-element tuple, which is then concatenated with the tuple returned by _extract.
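A minimal, self-contained sketch of this dispatch pattern is shown below. It assumes that `_extract` returns a `(pre, body, post)` tuple and uses a hypothetical `is_end` hook and `BranchLoop` class; only the `__subclasses__()` loop and the `+ (loop,)` concatenation mirror the diff. Note that `__subclasses__()` lists direct subclasses only, in definition order.

```python
class FatalParsingException(Exception):
    """Raised when no loop of the expected form can be found."""


class Loop:
    def __init__(self, lbl):
        self.lbl = lbl

    def is_end(self, line):
        raise NotImplementedError

    def _extract(self, source, lbl):
        """Return the tuple (pre, body, post) of line lists around loop `lbl`."""
        if f"{lbl}:" not in source:
            raise FatalParsingException(f"No label {lbl}")
        start = source.index(f"{lbl}:")
        for i in range(start + 1, len(source)):
            if self.is_end(source[i]):
                return source[:start], source[start + 1:i], source[i + 1:]
        raise FatalParsingException(f"No loop end found for {lbl}")

    @staticmethod
    def extract(source, lbl):
        """Try every direct subclass; return (pre, body, post, loop)."""
        for loop_type in Loop.__subclasses__():
            try:
                loop = loop_type(lbl)
                # (loop,) is a 1-tuple; `+` appends it to the 3-tuple.
                return loop._extract(source, lbl) + (loop,)
            except FatalParsingException:
                continue
        raise FatalParsingException(f"Couldn't identify loop {lbl}")


class LeLoop(Loop):
    def is_end(self, line):
        return line.strip().startswith("le ")


class BranchLoop(Loop):
    def is_end(self, line):
        return line.strip().startswith("bne ")
```

Callers can then unpack the result as `pre, body, post, loop = Loop.extract(source, lbl)` and use `loop` to emit the matching loop form later.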

hanno-becker · 2024-10-15T03:22:34Z

slothy/targets/arm_v81m/arch_v81m.py

+        raise FatalParsingException(f"Couldn't identify loop {lbl}")
+
+class LeLoop(Loop):
+


Nit: Add a docstring giving an example for the type of loop recognized here?

hanno-becker

Thank you @dop-amin, overall this looks good -- better flexibility for loop forms has been an embarrassing gap for a while.

While you're at it, could you improve the doc a bit, and hoist the abstract Loop class somewhere where it can be shared between architecture models?

Also, would you mind adding minimal examples to examples.py which demonstrate each loop form, so we run them in CI?

dop-amin · 2024-11-21T10:48:03Z

@hanno-becker I think the PR is now a good starting point for abstracting loop handling. However, new cases may come up in the future that require slight tweaks, e.g., passing more or different inputs to the loop subclasses. I already pass some data that I know will be useful from our experiments with Armv7m, especially for more complicated loop constructions that check against a pointer that is modified inside the kernel.

Are there any tests you'd like me to run besides CI? I have already fully optimized one aarch64 example, and the output code still passes the test.

dop-amin · 2024-11-27T10:07:40Z

> Thank you @dop-amin, overall this looks good -- better flexibility for loop forms has been an embarrassing gap for a while.
>
> While you're at it, could you improve the doc a bit, and hoist the abstract Loop class somewhere where it can be shared between architecture models?
>
> Also, would you mind adding minimal examples to examples.py which demonstrate each loop form, so we run them in CI?

I think this should all be done by now.

hanno-becker · 2024-11-29T19:35:42Z

slothy/targets/arm_v81m/arch_v81m.py

+    ```
+           loop_lbl:
+               {code}
+               le <cnt>, loop_lbl


The implementation is assuming that cnt increases by 1 per iteration, right?

This can be broken through SW pipelining, or not, if the loop count increment is chosen as an early instruction?

No, my fault -- this is already part of LE, I had forgotten.

hanno-becker · 2024-11-29T19:45:19Z

slothy/targets/aarch64/aarch64_neon.py

-        """Locate a loop with start label `lbl` in `source`.
-
-        We currently only support the following loop forms:
+        yield f"{indent}sub {other['cnt']}, {other['cnt']}, {other['imm']}"


Shouldn't this use the same format that the original loop had, i.e. with or without flag setting, and potentially using cbnz, bnz, or bne?

This is pre-existing, so let's not block the PR because of it.
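For reference, one way the original form could be preserved is to record the mnemonics during parsing and reuse them when emitting. The sketch below is an assumption-laden illustration, not the PR's code: the `op` and `br` keys and the functions `parse_end` and `emit_end` are hypothetical, and only `cbnz` and `bne` are handled.

```python
import re

# Hypothetical patterns that also capture which mnemonics the source used.
DEC = re.compile(r"(?P<op>subs?)\s+(?P<cnt>\w+),\s*(?P=cnt),\s*#(?P<imm>\d+)")
BRANCH = re.compile(r"(?P<br>cbnz|bne)\s+.+")


def parse_end(dec_line, branch_line):
    """Record counter, immediate, and the exact mnemonics of the loop end."""
    dec = DEC.fullmatch(dec_line.strip())
    branch = BRANCH.fullmatch(branch_line.strip())
    if dec is None or branch is None:
        raise ValueError("Unsupported loop end")
    return {**dec.groupdict(), "br": branch.group("br")}


def emit_end(other, lbl, indent="    "):
    """Re-emit the loop end in the same form it was parsed in."""
    yield f"{indent}{other['op']} {other['cnt']}, {other['cnt']}, #{other['imm']}"
    if other["br"] == "cbnz":
        yield f"{indent}cbnz {other['cnt']}, {lbl}"
    else:
        yield f"{indent}bne {lbl}"
```

The sketch simply reproduces what it parsed; it does not check that the combination is meaningful (for example, a `bne` after a non-flag-setting `sub`).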

dop-amin requested a review from hanno-becker October 11, 2024 14:04

hanno-becker reviewed Oct 11, 2024

hanno-becker reviewed Oct 15, 2024

hanno-becker reviewed Oct 15, 2024

hanno-becker requested changes Oct 15, 2024

hanno-becker reviewed Oct 16, 2024

hanno-becker reviewed Oct 16, 2024

dop-amin marked this pull request as ready for review November 21, 2024 10:48

dop-amin mentioned this pull request Nov 22, 2024

Armv7m modeling #103

Merged

hanno-becker self-requested a review November 25, 2024 06:38

dop-amin added 8 commits November 27, 2024 11:01

More general structure for loop parsing

b1a4ab5

Move Loop class

466e3f9

* Add file for common code between models

Improve documentation for loop abstraction

a5a4ce6

Add examples for loop types

f276e8c

Add option to give factor for loops with non-one in/decrements

2a900db

Switch strategy for loop counter offset modification

4305896

Make additional loop data part of class

6461088

Loop parsing improvments

899328b

dop-amin force-pushed the flexible_loops branch from cf17c29 to 899328b Compare November 27, 2024 10:05

hanno-becker reviewed Nov 29, 2024

hanno-becker approved these changes Nov 29, 2024

hanno-becker merged commit 90a78c0 into slothy-optimizer:main Nov 29, 2024
24 checks passed
